Efficient top-k processing in large-scaled distributed environments

Authors

Keping Zhao

Yufei Tao

Shuigeng Zhou

Abstract

The rapid development of networking technologies has made it possible to construct a distributed database that involves a huge number of sites. Query processing in such a large-scale system poses serious challenges beyond the scope of traditional distributed algorithms. In this paper, we propose a new algorithm, BRANCA, for performing top-k retrieval in these environments. Integrating two orthogonal methodologies, "semantic caching" and "routing indexes", BRANCA is able to solve a query by accessing only a small number of servers. Our algorithmic findings are accompanied by a solid theoretical analysis, which rigorously proves the effectiveness of BRANCA. Extensive experiments verify that our technique outperforms the existing methods significantly. © 2007 Elsevier B.V. All rights reserved.
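The abstract does not spell out how BRANCA works, so the following is not BRANCA itself. It is only a minimal sketch of the general idea that a routing-style summary can let a top-k query skip servers. It assumes, hypothetically, that each site publishes an upper bound on the scores it stores; the names `top_k_with_bounds`, `sites`, and `bounds` are illustrative and do not come from the paper.

```python
import heapq


def top_k_with_bounds(sites, bounds, k):
    """Return the k highest scores and the list of sites actually visited.

    sites  : dict mapping site id -> list of numeric scores (hypothetical data)
    bounds : dict mapping site id -> upper bound on that site's scores
             (an assumed, simplified stand-in for a routing index)
    """
    best = []      # min-heap holding the k largest scores seen so far
    visited = []   # sites that had to be contacted

    # Contact sites in decreasing order of their advertised upper bound.
    for site_id in sorted(sites, key=lambda s: bounds[s], reverse=True):
        # Once k results are known and no remaining site can beat the
        # current k-th score, every remaining site can be skipped.
        if len(best) == k and bounds[site_id] <= best[0]:
            break
        visited.append(site_id)
        for score in sites[site_id]:
            if len(best) < k:
                heapq.heappush(best, score)
            elif score > best[0]:
                heapq.heapreplace(best, score)

    return sorted(best, reverse=True), visited


if __name__ == "__main__":
    sites = {"a": [9, 8, 1], "b": [7, 2], "c": [3, 3], "d": [10]}
    bounds = {site: max(scores) for site, scores in sites.items()}
    print(top_k_with_bounds(sites, bounds, k=2))
```

In this toy setting only two of the four sites are contacted for `k=2`. A real system would have to maintain such bounds cheaply as data changes and combine them with cached results of earlier queries, which this sketch does not attempt.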


Similar resources

Efficient Top-k Query Processing Algorithms in Highly Distributed Environments

Efficient top-k query processing in highly distributed environments is a valuable but challenging research topic. This paper focuses on the problem over vertically partitioned data and aims to propose more efficient algorithms. The effort goes into limiting the data transferred and the communication round trips among nodes, in order to reduce the communication cost of query processing. Two novel algorit...


Top-k aggregation queries in large-scale distributed systems

Distributed top-k query processing has become an essential functionality in a large number of emerging application classes like Internet traffic monitoring and Peer-to-Peer Web search. This work addresses efficient algorithms for distributed top-k queries in wide-area networks where the index lists for the attribute values (or text terms) of a query are distributed across a number of data peers.


E2DR: Energy Efficient Data Replication in Data Grid

Data grids are an important branch of grid computing that provides mechanisms for managing large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems has traditionally focused on performance improvements driven by the demands of client applications in scientific and business domai...


Processing Top-k Queries in Distributed Hash Tables

Distributed Hash Tables (DHTs) provide a scalable solution for data sharing in large scale distributed systems, e.g. P2P systems. However, they only provide good support for exact-match queries, and it is hard to support complex queries such as top-k queries. In this paper, we propose a family of algorithms which deal with efficient processing of top-k queries in DHTs. We evaluated the performa...


Efficient Processing of Preference Queries in Distributed and Spatial Databases

Traditional SQL queries are recognized for producing an exact and complete result set. However, for an increasing number of applications that manage massive amounts of data, the large result set produced by traditional SQL queries has become difficult to handle. Therefore, there is an increasing interest in queries that produce a more concise result set. Preference queries capture the wishes of...


Journal:

Data Knowl. Eng.

Volume 63, issue not given

Pages: not given

Publication date: 2007
